111 research outputs found

Stable Feature Selection for Biomarker Discovery

Author: He Zengyou
Yu Weichuan
Publication date: 01/01/2010

Feature selection techniques have long been the workhorse of biomarker discovery applications. Surprisingly, the stability of feature selection with respect to sampling variations has long been under-considered, and only recently has this issue received growing attention. In this article, we review existing stable feature selection methods for biomarker discovery using a generic hierarchical framework. We have two objectives: (1) providing an overview of this new yet fast-growing topic as a convenient reference; (2) categorizing existing methods under an expandable framework for future research and development.
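As an illustration of what "stability with respect to sampling variations" can mean in practice, the sketch below scores how similar the selected feature subsets are across resampled datasets, using the average pairwise Jaccard index. This is only one common stability measure and is an assumption here; the abstract does not say which measures the review covers. The function names and the example feature names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two feature subsets (1.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def selection_stability(subsets):
    """Average pairwise Jaccard index over feature subsets.

    Each element of `subsets` is the set of features selected on one
    resampled version of the data (e.g. one bootstrap sample).
    """
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("need at least two feature subsets")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical selections from three resampled datasets.
runs = [{"geneA", "geneB"}, {"geneA", "geneB"}, {"geneA", "geneC"}]
stability = selection_stability(runs)
```

A value near 1 means the selector returns nearly the same features regardless of which samples it sees; a value near 0 means the selection is dominated by sampling noise.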

arXiv.org e-Print Archive

CiteSeerX

Hong Kong University of Science and Technology Institutional Repository

A Combinatorial Perspective of the Protein Inference Problem

Author: He Zengyou
Yang Chao
Yu Weichuan
Publication date: 28/11/2012

In a shotgun proteomics experiment, proteins are the most biologically meaningful output. The success of proteomics studies depends on the ability to identify proteins accurately and efficiently. Many methods have been proposed to facilitate the identification of proteins from the results of peptide identification. However, the relationship between protein identification and peptide identification has not been thoroughly explained before. In this paper, we take a combinatorial perspective on the protein inference problem. We employ combinatorial mathematics to calculate the conditional protein probabilities (the protein probability is the probability that a protein is correctly identified) under three assumptions, which lead to a lower bound, an upper bound and an empirical estimation of protein probabilities, respectively. The combinatorial perspective enables us to obtain a closed-form formulation for protein inference. Based on our model, we study the impact of unique peptides and degenerate peptides on protein probabilities. Here, degenerate peptides are peptides shared by at least two proteins. We also study the relationship of our model with other methods such as ProteinProphet. A probability confidence interval can be calculated and used together with the probability to filter the protein identification result. Our method achieves results competitive with ProteinProphet, in a more efficient manner, in experiments on two datasets of standard protein mixtures and two datasets of real samples. We name our program ProteinInfer. Its Java source code is available at http://bioinformatics.ust.hk/proteininfe

arXiv.org e-Print Archive

Hong Kong University of Science and Technology Institutional Repository

Using MicroPET Imaging in Quantitative Verification of Acupuncture Effect in Ischemia Stroke Treatment

Author: Hongtu Tang
Huafeng Liu
Jia Li
Ting Xiang
Weichuan Yu
Xiaoyan Shen
Publication date: 04/01/2010

While acupuncture has survived several thousand years of evolution in medical practice, its function remains a myth from the viewpoint of modern medicine. Our goal in this paper is to quantitatively understand the function of acupuncture in ischemic stroke treatment. We carried out a comparative study using the Sprague Dawley rat animal model. We induced focal cerebral ischemia in the rats using the middle cerebral artery occlusion (MCAO) procedure. For each rat from the real acupuncture group (n = 40), sham acupoint treatment group (n = 54), and blank control group (n = 16), we acquired 3-D FDG-microPET images at baseline, after MCAO, and after treatment (i.e., real acupuncture, sham acupoint treatment, or resting, according to the group assignment). After verifying, by magnetic resonance imaging (MRI) and triphenyl tetrazolium chloride (TTC) staining, that the injured area is in the right hemisphere of the cerebral cortex, we directly compared the glucose metabolism in the right hemisphere of each rat. We carried out a t-test and a permutation test on the image data. Both tests demonstrated that acupuncture had a more positive effect than non-acupoint stimulus and blank control (P < 0.025) in increasing the glucose metabolic level in the stroke-injured area of the brain, while there was no statistically significant difference between non-acupoint stimulus and blank control (P > 0.15). Our experiments thus verify the immediate positive effect of acupuncture over sham acupoint treatment and blank control. The long-term benefit of acupuncture needs to be studied further.
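For readers unfamiliar with the permutation test mentioned in this abstract, here is a minimal sketch of a one-sided two-sample permutation test on the difference in group means. It assumes each animal has been reduced to a single scalar (for example, a hypothetical per-rat change in glucose uptake); the abstract does not describe how the authors applied the test to image data, so the function name, parameters, and numbers below are illustrative only.

```python
import random
from statistics import mean


def permutation_test(treated, control, n_perm=10000, seed=0):
    """One-sided permutation p-value for mean(treated) > mean(control).

    Group labels are shuffled n_perm times; the p-value is the share of
    shuffles whose mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical per-animal measurements, not data from the study.
p_value = permutation_test([10, 11, 12, 13, 14], [1, 2, 3, 4, 5], n_perm=2000)
```

Because the test only relies on shuffling group labels, it makes no normality assumption, which is one reason it is often reported alongside a t-test.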

Nature Precedings

Moving Object Detection by Detecting Contiguous Outliers in the Low-Rank Representation

Author: Yang Can
Yu Weichuan
Zhou Xiaowei
Publication date: 22/06/2012

Object detection is a fundamental step for automated video analysis in many vision applications. Object detection in a video is usually performed by object detectors or background subtraction techniques. Often, an object detector requires manually labeled examples to train a binary classifier, while background subtraction needs a training sequence that contains no objects to build a background model. To automate the analysis, object detection without a separate training phase becomes a critical task. People have tried to tackle this task by using motion information, but existing motion-based methods are usually limited when coping with complex scenarios such as nonrigid motion and dynamic background. In this paper, we show that the above challenges can be addressed in a unified framework named DEtecting Contiguous Outliers in the LOw-rank Representation (DECOLOR). This formulation integrates object detection and background learning into a single optimization process, which can be solved efficiently by an alternating algorithm. We explain the relations between DECOLOR and other sparsity-based methods. Experiments on both simulated data and real sequences demonstrate that DECOLOR outperforms the state-of-the-art approaches and can work effectively on a wide range of complex scenarios.

arXiv.org e-Print Archive

Hong Kong University of Science and Technology Institutional Repository

3D-Orientation Signatures with Conic Kernel Filtering for Multiple Motion Analysis

Author: Daniilidis Kostas
Sommer Gerald
Yu Weichuan
Publication venue: ScholarlyCommons
Publication date: 08/12/2001

In this paper we propose a new 3D kernel for the recovery of 3D-orientation signatures. The kernel is a Gaussian function defined in local spherical coordinates, and its Cartesian support has the shape of a truncated cone with its axis in the radial direction and very small angular support. A set of such kernels is obtained by uniformly sampling the 2D space of polar and azimuth angles. The projection of a local neighborhood on such a kernel set produces a local 3D-orientation signature. In the case of spatiotemporal analysis, such a kernel set can be applied either on the derivative space of a local neighborhood or on the local Fourier transform. The well-known planes arising from single or multiple motion produce maxima in the orientation signature. Due to the kernel's local support, spatiotemporal signatures possess higher orientation resolution than 3D steerable filters, and motion maxima can be detected and localized more accurately. We describe, and show in experiments, the superiority of the proposed kernels compared to Hough transformation or EM-based multiple motion detection.

ScholarlyCommons@Penn

Optimization-Based Peptide Mass Fingerprinting for Protein Mixture Identification

Author: Can Yang
Chao Yang
Jason Po-Ming Tam
Robert Z Qi
Weichuan Yu
Zengyou He
Publication date: 08/09/2008

*Motivation:* In current proteome research, peptide sequencing is probably the most widely used method for protein mixture identification. However, this peptide-centric method has its own disadvantages, such as the immense volume of tandem Mass Spectrometry (MS) data needed for sequencing peptides. With the fast development of technology, it is possible to investigate alternative techniques. Peptide Mass Fingerprinting (PMF) has been widely used to identify single purified proteins for more than 15 years. Unfortunately, this technique is less accurate than the peptide sequencing method and cannot handle protein mixtures, which hampers the widespread use of the PMF technique. If we can remove these limitations, PMF will become a useful tool in protein mixture identification.
*Results:* We first formulate the problem of PMF protein mixture identification as an optimization problem. Then, we show that the use of some simple heuristics enables us to find good solutions. As a result, we obtain much better identification results than previous methods. Moreover, the result on real MS data is comparable with that of the peptide sequencing method. Through a comprehensive simulation study, we identify a set of limiting factors that hinder the performance of the PMF method in protein mixtures. We argue that it is feasible to remove these limitations and that PMF can be a powerful tool in the analysis of protein mixtures.

Nature Precedings

BOOST: A fast approach to detecting gene-gene interactions in genome-wide case-control studies

Author: Fan Xiaodan
Tang Nelson L. S.
Wan Xiang
Xue Hong
Yang Can
Yang Qiang
Yu Weichuan
Publication date: 01/01/2010

Gene-gene interactions have long been recognized as fundamentally important to understanding the genetic causes of complex disease traits. At present, identifying gene-gene interactions from genome-wide case-control studies is computationally and methodologically challenging. In this paper, we introduce a simple but powerful method, named 'BOolean Operation based Screening and Testing' (BOOST). To discover unknown gene-gene interactions that underlie complex diseases, BOOST allows examining all pairwise interactions in genome-wide case-control studies in a remarkably fast manner. We have carried out interaction analyses on seven data sets from the Wellcome Trust Case Control Consortium (WTCCC). Each analysis took less than 60 hours on a standard 3.0 GHz desktop with 4G memory running Windows XP. The interaction patterns identified from the type 1 diabetes data set differ significantly from those identified from the rheumatoid arthritis data set, while both data sets share a very similar hit region in the WTCCC report. BOOST has also identified many previously undiscovered interactions between genes in the major histocompatibility complex (MHC) region in the type 1 diabetes data set. In the coming era of large-scale interaction mapping in genome-wide case-control studies, our method can serve as a computationally and statistically useful tool.
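The abstract does not spell out how Boolean operations are used, so the following is only a plausible sketch of the general idea, not the authors' implementation. It assumes each SNP is stored as three bitmasks (one per genotype value 0, 1, 2, with bit i set when sample i carries that genotype), so that the 3x3 genotype contingency table of a SNP pair can be filled with bitwise AND plus a bit count instead of a per-sample loop. The function names and the toy genotypes are hypothetical.

```python
def encode_snp(genotypes):
    """Encode one SNP as three bitmasks, one per genotype value 0/1/2.

    Bit i of masks[g] is set when sample i has genotype g.
    """
    masks = [0, 0, 0]
    for i, g in enumerate(genotypes):
        masks[g] |= 1 << i
    return masks


def pair_table(masks_a, masks_b):
    """3x3 contingency table for two SNPs via bitwise AND and bit counting.

    Entry [i][j] is the number of samples with genotype i at SNP A
    and genotype j at SNP B.
    """
    return [[bin(ma & mb).count("1") for mb in masks_b] for ma in masks_a]


# Toy genotypes for five samples at two SNPs (illustrative only).
snp_a = encode_snp([0, 1, 2, 0, 1])
snp_b = encode_snp([0, 0, 2, 1, 1])
table = pair_table(snp_a, snp_b)
```

In a case-control setting, the same counting could be repeated after AND-ing each mask with a case mask or a control mask, which yields one table per group for a downstream interaction test.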

arXiv.org e-Print Archive

CiteSeerX

Elsevier - Publisher Connector

PubMed Central